Automatic Determination of K in Distributed K-Means Clustering

نویسندگان
چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Persistent K-Means: Stable Data Clustering Algorithm Based on K-Means Algorithm

Identifying clusters or clustering is an important aspect of data analysis. It is the task of grouping a set of objects in such a way those objects in the same group/cluster are more similar in some sense or another. It is a main task of exploratory data mining, and a common technique for statistical data analysis This paper proposed an improved version of K-Means algorithm, namely Persistent K...

متن کامل

Distributed k-Means and k-Median Clustering on General Topologies

This paper provides new algorithms for distributed clustering for two popular center-based objectives, k-median and k-means. These algorithms have provable guarantees and improve communication complexity over existing approaches. Following a classic approach in clustering by [13], we reduce the problem of finding a clustering with low cost to the problem of finding a coreset of small size. We p...

متن کامل

K-Means Clustering with Distributed Dimensions (Supplement)

1. Proof of Theorem 4 Proof. It is easy to verify the communication cost, and thus we focus on the proof for the approximation ratio below. Similar to the proof of Theorem 1, the grid G is rewritten as {g1, · · · , gm} where m = (k + z) , and for each gj , its corresponding intersection ⋂T l=1Mlil is rewritten as Sj . Meanwhile, we denote the index-set indicating the outliers obtained by our al...

متن کامل

Distributed Document Clustering Using K-Means

Document clustering, one of the traditional data mining techniques, is an unsupervised learning paradigm where clustering methods try to identify inherent grouping of the text documents.The importance of document clustering emerges from the massive volumes of textual documents created. Also, with more and more development of information technology, data set in many domains is reaching beyond pe...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Procedia Computer Science

سال: 2019

ISSN: 1877-0509

DOI: 10.1016/j.procs.2020.01.050